Detecting Depression States Based on Sensor Data

Data Science Pipeline Tutorial

By Anastasia Kortjohn



Introduction

Depression is a mood disorder that affects more than 264 million people globally, making it one of the leading causes of disability worldwide. It's characterized by symptoms such as profound sadness, feelings of emptiness, anxiety, sleep disturbance, and a general loss of initiative and interest in activities. The severity of depression is determined by the number of symptoms, their seriousness and duration, and their impact on social and occupational functioning.

One way to classify depression is as unipolar or bipolar: unipolar depression refers to major depressive disorder, while bipolar depression is a facet of bipolar disorder. Both are genetic mood disorders and share symptoms, but a distinction should be made between the two: bipolar depression is unique in the periodic occurrence of mania, a state associated with inflated self-esteem, impulsivity, increased activity, goal-directed actions, and reduced sleep.

Although there are known, effective treatments for mental disorders, between 76% and 85% of people in low- and middle-income countries receive no treatment for their disorder. One barrier to effective care is inaccurate assessment: in countries of all income levels, people who are depressed are often not correctly diagnosed.

How does sensor data come into play?

Actigraphs are small motion sensors (accelerometers) encased in a unit about the size of a wristwatch, which can be worn continuously for days to months. It is well established that depression is characterized by altered motor activity, and actigraph recordings of motor activity are considered an objective method for observing depression. Although the topic has not yet been exhaustively studied, there is increasing awareness in psychiatry of how activity data relates to various mental health issues such as changes in mood or personality, inability to cope with daily problems, stress, and withdrawal from friends and activities.

In the following tutorial, we will walk through the Data Science Pipeline to see if depression states can be accurately predicted through the sensor data recorded by Actigraphs.



The Dataset

We'll be looking at Actigraph data originally collected for a study on motor activity in schizophrenia and major depression. Actigraphs continuously record an activity count proportional to the intensity of movement in one-minute intervals. The dataset consists of actigraphy data collected for the condition group (23 unipolar and bipolar depressed patients) as well as the control group (32 non-depressed contributors). We'll be using the Montgomery-Asberg Depression Rating Scale (MADRS) score included in the data for each participant to identify the severity of an ongoing depression. The score is based on ten items relevant to depression, which clinicians rate based on observation and conversation with the patient. The sum score (0-60) represents the severity: scores below 10 are classified as an absence of depressive symptoms, and scores above 30 indicate a severe depressive state.



Getting started

First let's import the libraries we'll use throughout the tutorial.

Collecting and Curating the Data

The dataset contains:

- scores.csv, with one row per participant holding demographic details and the MADRS scores
- one csv file of Actigraph recordings (timestamps and activity counts) per participant, for both the control and condition groups

Scores Data

To get the data for the control and condition groups from scores.csv, we use pandas to read the csv file and store the data in a DataFrame. I have the scores.csv in a folder called 'data', so I use the relative path data/scores.csv to access it.
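A minimal sketch of that read; since the real file isn't bundled here, an inline sample with hypothetical rows and column names stands in for data/scores.csv, but the call is the same.

```python
import io
import pandas as pd

# Hypothetical sample standing in for data/scores.csv.
sample = io.StringIO(
    "number,days,gender,age,madrs1,madrs2\n"
    "condition_1,11,2,35-39,19,21\n"
    "control_1,13,1,40-44,,\n"
)
scores = pd.read_csv(sample)   # in the tutorial: pd.read_csv("data/scores.csv")
print(scores.shape)            # (2, 6): two participants, six columns
```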

Managing and Representing the Scores Data

Right away we can see there is a lot of missing data, most of it the depression data for the control group. We expect this to be missing for the control group, because depression data is only collected for the condition group; thus it's considered missing at random (MAR). The control group is also missing data for education, marriage, and work. This data could also be considered MAR: it looks like only number, days, gender, and age were collected for the control group, so the data is missing because the participant is part of the control group. Since we'll be focusing mainly on the Actigraph data for this group, it shouldn't be a concern for the rest of the tutorial. For the condition group, there are three missing melancholia scores and one missing education range value. It's not immediately clear which kind of missing data these are; after the following step, we'll take a closer look to see if there's any correlation among the condition group's missing values.

Notes

Since we'll be looking at differences between the condition and control groups, let's split the scores data into a control group DataFrame and a condition group DataFrame. Even though we could use the numeric indices to select a subset of the DataFrame, selecting rows based on column values (e.g. control or condition) is more robust when there are many rows or the partition of indices isn't clear, as follows.
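One way to sketch that value-based selection, assuming the participant id in the number column carries the group as a prefix (as the dataset's file names do); the rows here are hypothetical.

```python
import pandas as pd

# Toy scores table; control rows have no MADRS data.
scores = pd.DataFrame({
    "number": ["control_1", "control_2", "condition_1", "condition_2"],
    "madrs1": [None, None, 19.0, 24.0],
})
# Select rows by value rather than by positional index.
control = scores[scores["number"].str.startswith("control")]
condition = scores[scores["number"].str.startswith("condition")]
print(len(control), len(condition))  # 2 2
```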

We could remove all the NaN columns for the control group since we won't use them in the analysis, but I like to keep them so the columns stay the same as the condition group DataFrame.

As mentioned previously, the condition group has missing data. It doesn't look as though it depends on any data in other columns, so it may be missing completely at random (MCAR). We can use different types of plots to display a correlation between missing values. With only two columns containing missing data, the following missingno heatmap is not too informative, but it could be useful for larger datasets with more missing data.
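For intuition, missingno's heatmap displays the correlations between the nullity (is-missing) indicators of each column, which can also be computed directly with pandas; the toy columns below are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "melanch": [1.0, np.nan, 2.0, np.nan],
    "edu":     ["6-10", "11-15", np.nan, "6-10"],
    "madrs1":  [19, 24, 17, 21],          # fully observed
})
# msno.heatmap(df) visualizes exactly these nullity correlations;
# fully observed columns give a constant indicator (NaN correlation)
# and are dropped from the heatmap.
nullity_corr = df.isnull().corr()
print(nullity_corr.loc["melanch", "edu"].round(3))  # -0.577
```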

Since the goal in identifying the type of missing data is to determine if and how it might affect future analyses, let's use seaborn to plot the scores we'll be using in the analysis (MADRS 1 and 2) on the X and Y axis, then see if the missing melancholia values correlate in any way. We can use the color and size to differentiate the NA values, as well as style to mark the education range values.
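A sketch of that plot with hypothetical condition-group rows: hue and size flag the rows whose melancholia score is missing, and style marks the education range.

```python
import matplotlib
matplotlib.use("Agg")            # render off-screen
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "madrs1": [19, 24, 17, 21, 26],
    "madrs2": [21, 22, 14, 25, 23],
    "melanch_missing": [False, True, False, True, True],
    "edu": ["6-10", "11-15", "6-10", "16-20", "11-15"],
})
ax = sns.scatterplot(data=df, x="madrs1", y="madrs2",
                     hue="melanch_missing", size="melanch_missing",
                     style="edu")
```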

Other than 2 of the 3 NA points overlapping other plot points, the figure doesn't show any particular correlation, e.g. that all the missing melancholia scores have the same MADRS 1 or 2 scores, or that all three have the same education range value. As a result, we can move forward with our analysis and treat the missing condition group data as MCAR.

Actigraph Data

Now that we've organized the scores data, we need to get all the Actigraph data. In each part, after getting and managing the control group data, we will repeat the process for the condition group. To store the timestamps and activity data, we'll make a control list and a condition list, and fill them with a DataFrame for each csv file (each participant), so we can easily index each list by participant. In addition, we'll map each participant number to their row count (number of Actigraph recordings); this will help later on when selecting a range of data to plot. There is also a check in the code that will print a message if there are any missing values in the Actigraph data. Lastly, for plotting, we'll also create a DataFrame with columns of just the activity data.
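A sketch of that loading loop, assuming a hypothetical layout of data/control/*.csv and data/condition/*.csv where each file has timestamp, date, and activity columns:

```python
import pandas as pd
from pathlib import Path

def load_group(folder):
    """Read each participant csv in a group folder into a DataFrame,
    parsing timestamps, checking for missing values, and recording
    the row count per participant."""
    frames, row_counts = [], {}
    for path in sorted(Path(folder).glob("*.csv")):
        df = pd.read_csv(path, parse_dates=["timestamp"])
        if df.isnull().values.any():          # missing-value check
            print(f"Missing values in {path.name}")
        frames.append(df)
        row_counts[path.stem] = len(df)       # e.g. "control_1" -> rows
    return frames, row_counts
```

Calling load_group("data/control") would then fill the control list and row counts, and the same call on data/condition fills the condition list.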

Managing and Representing the Data

As the comments in the code describe, we parse the timestamps while reading the csv so they'll be formatted as DateTime objects, which may come in handy later on in the tutorial. So far, we've prepared the data in a few different ways for later use: a list of DataFrames, a DataFrame each for the control and condition groups with the Actigraph data as columns, and a dictionary mapping each participant to their row count (number of Actigraph recordings).

Now, let's use matplotlib to get a visual idea of the data we're working with.
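A sketch of those sample plots, one subplot per participant; synthetic activity series stand in here for the per-participant DataFrames loaded above.

```python
import matplotlib
matplotlib.use("Agg")                 # render off-screen
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for three participants' activity series.
rng = np.random.default_rng(0)
samples = {"control_1": rng.poisson(120, 2000),
           "condition_2": rng.poisson(80, 2000),
           "condition_3": rng.poisson(90, 2000)}
fig, axes = plt.subplots(len(samples), 1, figsize=(10, 8), sharex=True)
for ax, (name, activity) in zip(axes, samples.items()):
    ax.plot(activity, linewidth=0.5)  # activity count per minute
    ax.set_title(name)
    ax.set_ylabel("activity")
axes[-1].set_xlabel("minute")
fig.tight_layout()
```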

We can see from these three samples that the Actigraph data spans one to two months, and there is a lot of variation in activity intensity over time. That variation might be something interesting to explore in the next section.

The subplots have their own y-axis tick values, so we need to look carefully at the range if we want to compare between the samples. From this initial glance at the data, we can see the upper range of the control group's activity reaches 2,000 (excluding outliers). Conditions 2 and 3 both have a high density of points under 1,000, with scattered points and spikes going up to 3,000-3,500. Although we can't make any assumptions about the groups as a whole since we're only looking at three participants, we've seen there are certain days or weeks during the months recorded where the activity spikes, and the intensity values fall in the 0-8,000 range.

Exploratory Data Analysis and Visualization

For the first step in the exploratory data analysis, let's see if we can better understand the range and other statistics of each group using boxplots. It's important to acknowledge that the size of our Actigraph datasets (up to 50,000 rows per participant) can create some difficulty when plotting. For example, we saw earlier there are frequent spikes in the activity intensity, so we can expect many outliers. This may impact the decision on whether or not to display the fliers (points past the ends of the whiskers, which represent the outliers). In fact, showfliers=False reduced the file size of the .ipynb notebook version of this tutorial by 15 MB, so in the end the fliers had to go. For the following boxplots, we'll focus on the other information they depict.
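A sketch of the boxplot call; the heavy-tailed synthetic series are hypothetical, but showfliers=False is the setting described above.

```python
import matplotlib
matplotlib.use("Agg")                 # render off-screen
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# One long, spiky series per participant (heavy right tail -> many outliers).
activity = [rng.exponential(150, 20000) for _ in range(5)]
fig, ax = plt.subplots(figsize=(8, 4))
box = ax.boxplot(activity, showfliers=False)   # omit outlier points
ax.set_xlabel("participant")
ax.set_ylabel("activity")
```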

From the plots, we can tell the median activity for both groups is quite low; no participant seems to have a median activity much above 100. The control group generally has longer whiskers, likely because there are more outliers, representing more spikes in activity intensity. All of the 3rd quartiles in the condition group plot, and most in the control group plot, are less than 500, indicating 75% of the activity data is below 500.

Next let's look at the sum of activity data for each participant. This discards any consideration of variation (which we'll be looking at soon), but it will be interesting to learn if there's any correlation between total activity over time and MADRS scores. One way to approach the sums is to select a fixed range of activity data; otherwise we'd be adding different amounts of activity data for each participant, since the date ranges differ. For this route we'd use the minimum number of rows we found previously, 19,299. Some participants have 50,000 rows of activity data, so selecting the first 19,299 rows doesn't account for the case where a participant goes from one extreme to another partway through, i.e. from being very active to barely active. Because of that, let's normalize the activity sums instead: sum all the activity data and divide by the number of rows, to get the average activity value for each participant. This way, we consider the entire timeframe over which each participant's activity was recorded.
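The normalization boils down to sum divided by row count, i.e. the mean; a sketch with toy stand-ins for the per-participant DataFrames:

```python
import pandas as pd

# Toy stand-ins for the per-participant activity DataFrames.
frames = {
    "condition_1": pd.DataFrame({"activity": [0, 120, 300, 40]}),
    "condition_2": pd.DataFrame({"activity": [10, 20, 0]}),
}
# Total activity divided by the number of recorded minutes, so
# recordings of different lengths stay comparable.
averages = {name: df["activity"].sum() / len(df)
            for name, df in frames.items()}
print(averages["condition_1"], averages["condition_2"])  # 115.0 10.0
```

Equivalently, each average is just df["activity"].mean().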

The averages for the condition group tend to be less than those for the control group. There also appear to be more distinct levels of variation among condition group participants. A question to explore further is whether these differences can be linked to the different affective disorder types (1: bipolar II, 2: unipolar depressive, 3: bipolar I). Furthermore, in the condition group, participants 4, 7, 10, 14, and 23 have smaller activity averages; when we analyze the correlation between activity and MADRS scores in the next step, I'd expect these participants to have higher MADRS scores, indicating more severe depression (unless we fail to reject the null hypothesis). Since we've averaged the activity data over time, we'll also average the MADRS 1 and 2 scores (the scores when activity measurement started and stopped).
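Averaging the two scores is a row-wise mean over the madrs1 and madrs2 columns; the rows below are hypothetical, and the column names assume the scores.csv layout.

```python
import pandas as pd

scores = pd.DataFrame({
    "number": ["condition_1", "condition_2"],
    "madrs1": [19, 24],   # score when activity measurement started
    "madrs2": [21, 22],   # score when activity measurement stopped
})
# Row-wise mean of the start and stop scores.
scores["madrs_avg"] = scores[["madrs1", "madrs2"]].mean(axis=1)
print(scores["madrs_avg"].tolist())   # [20.0, 23.0]
```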

The participants appear to be clustered in two groups, with scores of 0-20 and 22-30. Almost all those I expected to have high scores based on low activity do, except participant 4. This is our first visualization in the tutorial to show, for the condition group in particular, that activity could affect the MADRS score. It looks like we could be on the right track. However, the pattern doesn't hold as well for higher activity levels; of participants 9, 12, 15, and 22 (the highest average activity levels from the previous step), only one is in the lower score group as we might expect.

A majority of participants have affective disorder type 2 (unipolar depressive). There is a fairly even split of disorder types between the lower and higher score ranges. This isn't surprising given that we've taken averages of the scores and activity. In fact, it brings us to our next step: visualizing the variation and seeing if there's a correlation between the variation in activity and the MADRS score or disorder type.

The variance plots are not what we would hope for, since they don't show any obvious trend or cluster of data. The LOESS curve, a local regression in which the fit is weighted toward nearby data points, shows some relation between the MADRS 1 and MADRS 2 scores, but not much else. Even though I hoped the variance would show some correlation to the MADRS scores, I think the lack of it can be explained by sleep schedules and daily routines: activity for every participant will likely decrease during the nighttime hours, and thus the analysis of the overall variance was not that useful. It would be interesting to explore this idea further by computing the variance for only certain times of the day. Since we know the circadian rhythm can be affected by a depressive episode, it may also be worthwhile to plot the activity in the middle of the night.
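A sketch of that idea: with the parsed timestamps we can keep only certain hours before computing the variance (the 8:00-20:00 daytime window and the toy half-hourly data here are arbitrary choices).

```python
import pandas as pd

# Toy 24-hour recording at half-hour resolution.
df = pd.DataFrame({
    "timestamp": pd.date_range("2003-05-07", periods=48, freq="30min"),
    "activity": range(48),
})
# Restrict to daytime hours (8:00-20:59) before computing variance.
daytime = df[df["timestamp"].dt.hour.between(8, 20)]
print(len(daytime), round(daytime["activity"].var(), 1))  # 26 58.5
```

Swapping the hour window (e.g. between(0, 4)) would give the middle-of-the-night view mentioned above.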

Hypothesis Testing and Machine Learning

So far we've compared the activity data between the control and condition groups, and we've also looked for correlations between activity and MADRS scores within the condition group. Variance has proved not to be a useful indicator on its own.

Conclusion